Detection of Eukaryotic Promoter Regions Using Stochastic Language Models

Authors

  • Uwe Ohler
  • Martin G. Reese
Abstract

We present a new search-by-content method to identify transcriptional regulatory regions in eukaryotic genomic sequences. The method is based on stochastic language models, which are a straightforward generalization of oligomer statistics. We describe the theoretical background and the different parameter estimation techniques used to build the models. The resulting language models are applied to classify fixed-length sequences into the classes of promoters and non-promoters, and to search for transcription start sites in contiguous sequences. Detailed classification results for human and Drosophila data sets are presented, and the practical applicability of the models is demonstrated on an independent test set of vertebrate genomic sequences. On this set, which has already been used to compare different computational approaches for promoter recognition, the performance of our method is comparable to the best algorithms described so far. The number of false positives can be further reduced by a post-processing step on the output scores. Examining both strands of the independent test set, the models are thus able to identify about half of the annotated transcription start sites (12 out of 22) while making a false prediction roughly every 800 base pairs.
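The abstract does not give the model details, so the following is only a minimal sketch of the general idea, assuming a fixed-order Markov chain over the DNA alphabet as the "language model" and add-one smoothing as the estimation technique (both are illustrative choices, not necessarily those of the paper). The names `MarkovChainModel`, `classify`, and `scan`, as well as the toy training sequences, are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class MarkovChainModel:
    """Order-k Markov chain over DNA with add-one smoothing (illustrative choice)."""

    def __init__(self, order=1):
        self.order = order
        # counts[context][next_base] = number of times next_base followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequences):
        for seq in sequences:
            for i in range(self.order, len(seq)):
                context = seq[i - self.order:i]
                self.counts[context][seq[i]] += 1

    def log_prob(self, seq):
        """Log-probability of seq given its first `order` bases."""
        total = 0.0
        for i in range(self.order, len(seq)):
            context = seq[i - self.order:i]
            ctx_counts = self.counts.get(context, {})
            numerator = ctx_counts.get(seq[i], 0) + 1
            denominator = sum(ctx_counts.values()) + len(ALPHABET)
            total += math.log(numerator / denominator)
        return total


def classify(seq, promoter_model, background_model, threshold=0.0):
    """Label a fixed-length sequence by its log-likelihood ratio."""
    score = promoter_model.log_prob(seq) - background_model.log_prob(seq)
    label = "promoter" if score > threshold else "non-promoter"
    return label, score


def scan(seq, window, promoter_model, background_model):
    """Slide a fixed-length window along a contiguous sequence, yielding (start, score)."""
    for start in range(len(seq) - window + 1):
        _, score = classify(seq[start:start + window], promoter_model, background_model)
        yield start, score


# Toy training data, invented for illustration only (not real promoter sequences).
promoter_model = MarkovChainModel(order=1)
promoter_model.train(["TATAAATATAAA", "TTATATAAAT"])
background_model = MarkovChainModel(order=1)
background_model.train(["GCGCGGCCGC", "CCGGCGCGCG"])

print(classify("TATATA", promoter_model, background_model))
print(list(scan("GCGCGCTATATAGCGCGC", 6, promoter_model, background_model)))
```

In this sketch, a window is called a promoter when its log-likelihood ratio exceeds a threshold. Raising the threshold trades missed sites for fewer false positives, which is the kind of trade-off the abstract reports.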

Similar resources

Stochastic segment models of eukaryotic promoter regions.

We present a new statistical approach for eukaryotic polymerase II promoter recognition. We apply stochastic segment models in which each state represents a functional part of the promoter. The segments are trained in an unsupervised way. We compare segment models with three and five states with our previous system which modeled the promoters as a whole, i.e. as a single state. Results on the c...

Detection of Outliers and Influential Observations in Linear Ridge Measurement Error Models with Stochastic Linear Restrictions

The aim of this paper is to propose some diagnostic methods in linear ridge measurement error models with stochastic linear restrictions using the corrected likelihood. Based on the bias-corrected estimation of model parameters, diagnostic measures are developed to identify outlying and influential observations. In addition, we derive the corrected score test statistic for outlier detection ba...

Modeling and Evaluation of Stochastic Discrete-Event Systems with RayLang Formalism

In recent years, formal methods have been used as an important tool for performance evaluation and verification of a wide range of systems. From the viewpoint of engineers and practitioners, however, there are still some major difficulties in using formal methods. In this paper, we introduce a new formal modeling language to fill the gaps between object-oriented programming languages (OOPLs) us...

Salient regions detection in satellite images using the combination of MSER local features detector and saliency models

Nowadays, owing to improvements in the quality of satellite images, automatic target detection in these images has attracted the attention of many researchers. Remote-sensing images contain various geospatial targets; these targets are generally man-made and have a structure distinct from their surrounding areas. Different methods have been developed for automatic target detection. In most of these met...

Publication date: 1998